Add loader for animovement/aniframe Parquet files by roaldarbol · Pull Request #963 · neuroinformatics-unit/movement

roaldarbol · 2026-04-16T12:09:08Z

Summary

Implements a reader for the aniframe format (Parquet files produced by the animovement R ecosystem), enabling data exchange between the movement Python package and the aniframe/aniread R packages.

Closes #307.

Background

The aniframe format is a long-format tidy data frame (one row per individual x keypoint x time) serialised as a Parquet file with rich metadata stored in the file-level schema metadata. The ecosystem includes:

aniframe -- core data structures (S3 class + metadata spec)
aniread -- readers/writers for many tracking formats including movement's own netCDF

The aniread read_movement() function already reads movement netCDF into aniframe; this PR implements the reverse direction.

Format overview (confirmed from a real aniframe Parquet file)

Columns (three conceptual slots)

Slot	Default columns	Optional columns
what (entity identity)	`individual`, `keypoint`	`model`, `track`
when (temporal)	`time`	`session`, `trial`
where (spatial)	`x`, `y`	`z`, `rho`, `phi`, `theta`
confidence	`confidence`	(optional)

Column types observed: individual as int32, keypoint as Arrow dictionary (categorical), session/trial as int32, time as int32, x/y/confidence as float64.

Parquet metadata

The aniframe metadata is stored in the Parquet file-level schema metadata under the key b'r', serialised using R's native ASCII serialisation format (version 3, starts A\n3\n...). It is not JSON. Fields include:

Field	Type	Default
`source`	string	NA
`sampling_rate`	numeric (Hz)	NA
`unit_time`	factor	`"frame"`
`unit_space`	factor	`"px"`
`unit_angle`	factor	`"rad"`
`reference_frame`	factor	`"allocentric"`
`coordinate_system`	factor	`"cartesian_2d"`
`point_of_reference`	factor	`"bottom_left"`
`variables_what` / `variables_when` / `variables_where`	char vectors	see above
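
As a minimal sketch of the first step a loader might take, the check below looks for the `b'r'` key and confirms the blob starts with the `A\n3\n` header described above. The function name `find_r_metadata` is hypothetical, and the input is assumed to be a plain dict of bytes keys to bytes values, which is the shape pyarrow exposes as `schema.metadata`. Decoding the blob itself is left to a dedicated parser.

```python
def find_r_metadata(schema_metadata):
    """Return the raw R-serialised blob stored under b'r'.

    `schema_metadata` is assumed to be a dict mapping bytes to bytes.
    Raises ValueError if the key is missing or the blob does not start
    with the ASCII-format marker 'A' followed by version 3.
    """
    blob = schema_metadata.get(b"r")
    if blob is None:
        raise ValueError("No b'r' key found in the Parquet schema metadata.")
    if not blob.startswith(b"A\n3\n"):
        raise ValueError("b'r' metadata is not R ASCII serialisation v3.")
    return blob
```

Failing early here gives a single clear error instead of a parser traceback when a non-aniframe Parquet file is passed in.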

Design decisions

Column mapping: aniframe -> movement

aniframe column	movement	Action
`individual`	`individuals`	Direct -- canonical name
`keypoint`	`keypoints`	Direct -- canonical name
`track`	`individuals`	Rename with warning
`time`	`time` coord	Convert units (see below); preserve original values (aniframe time starts at 1, not 0)
`x`, `y`	`space = ["x", "y"]`	Direct 2D Cartesian
`x`, `y`, `z`	`space = ["x", "y", "z"]`	Direct 3D Cartesian
`confidence`	`confidence` variable	Optional; fill NaN if absent

Extra variables_what/variables_when columns (e.g., model, session, trial): the general rule is whether the extra columns are resolvable -- i.e., they contain only a single unique value and can be safely dropped:

Extra column with one unique value -> drop with an info-level log message
Extra column with multiple unique values -> error out (movement cannot represent hierarchical identity or multi-level time contexts)

A file with session=1 and trial=1 (constants) would load cleanly under this rule.
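
The drop-or-error rule above could be sketched roughly as follows. This is an illustration only: `resolve_extra_columns` is a hypothetical name, and rows are modelled as plain dicts rather than a DataFrame to keep the example self-contained.

```python
import logging

logger = logging.getLogger(__name__)


def resolve_extra_columns(rows, extra_columns):
    """Return {column: constant value} for extra columns safe to drop.

    Raises ValueError as soon as an extra column holds more than one
    unique value, since that cannot be collapsed without losing data.
    """
    dropped = {}
    for name in extra_columns:
        unique_values = {row[name] for row in rows}
        if len(unique_values) > 1:
            raise ValueError(
                f"Column '{name}' has {len(unique_values)} unique values "
                "and cannot be represented in a single dataset."
            )
        value = next(iter(unique_values), None)
        logger.info("Dropping constant column '%s' (value: %r)", name, value)
        dropped[name] = value
    return dropped
```

For a file where every row has `session=1` and `trial=1`, the call returns `{"session": 1, "trial": 1}` and the load can proceed.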

Polar/spherical coordinates (rho, phi, theta): error out -- movement only supports Cartesian space coordinates.

Metadata mapping

aniframe	movement `.attrs`	Notes
`source`	`source_software`	Forward original source name (e.g., `"SLEAP"`, `"DeepLabCut"`); fall back to `"aniframe"` if `source` is NA
`sampling_rate`	`fps`	Hz = fps, direct
`unit_time`	`time_unit`	Auto-convert all units to seconds; derive `fps` from `sampling_rate`
`unit_space`	`space_unit` (custom)	Preserved as custom attribute
`reference_frame`	`reference_frame` (custom)	Preserved as custom attribute
`point_of_reference`	`point_of_reference` (custom)	Warn when `"bottom_left"` (movement/napari convention is top-left origin)

Time unit handling: automatically convert any unit_time value to seconds and derive fps from sampling_rate where available:

"frame" -> pass raw frame numbers to from_numpy() without fps (time axis = original integers, e.g. starting at 1)
"s" -> already in seconds; set fps = sampling_rate
"ms", "us", "ns" -> divide to seconds; set fps = sampling_rate
"m", "h" -> convert to seconds; set fps = sampling_rate

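The time-unit rules listed above amount to a small lookup table. A rough sketch, assuming the unit strings exactly as listed (with `"m"` meaning minutes); the names `SECONDS_PER_UNIT` and `convert_time` are hypothetical and not taken from the PR code:

```python
# Seconds per unit for the unit_time values listed above.
SECONDS_PER_UNIT = {
    "ns": 1e-9,
    "us": 1e-6,
    "ms": 1e-3,
    "s": 1.0,
    "m": 60.0,
    "h": 3600.0,
}


def convert_time(values, unit_time, sampling_rate=None):
    """Return (time_values, fps) for a given aniframe unit_time.

    Frame numbers are passed through untouched with fps=None; every
    other supported unit is scaled to seconds and fps is taken from
    sampling_rate.
    """
    if unit_time == "frame":
        return list(values), None
    if unit_time not in SECONDS_PER_UNIT:
        raise ValueError(f"Unsupported unit_time: {unit_time!r}")
    factor = SECONDS_PER_UNIT[unit_time]
    return [value * factor for value in values], sampling_rate
```

Keeping the table in one place means a new unit only needs one extra entry.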
Scope

Poses only for this PR. Bounding-box support (aniframe with keypoint = "centroid") is deferred.
Load only (aniframe -> movement). The reverse is already handled by aniread's read_movement().

Dependencies

`pyarrow`

Not currently in movement's dependency tree -- not in core dependencies, not in any optional extras. xarray[accel,io,viz] does not pull it in transitively (confirmed by inspection of the installed environment).
Needed for both reading the Parquet file and accessing the file-level schema metadata (where the aniframe R metadata lives).
Should be added as a core dependency or gated with a runtime check_installed check following the pattern in aniread's check_arrow().
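
If the runtime-check route is taken, one possible shape for it in Python is sketched below. The helper name `require_modules` and the wording of the message are assumptions for illustration; `importlib.util.find_spec` is the standard-library call that reports whether a top-level module can be found without importing it.

```python
import importlib.util


def require_modules(*module_names, install_hint=""):
    """Raise ImportError listing any top-level modules not installed."""
    missing = [
        name
        for name in module_names
        if importlib.util.find_spec(name) is None
    ]
    if missing:
        message = "Missing optional dependencies: " + ", ".join(missing) + "."
        if install_hint:
            message += " " + install_hint
        raise ImportError(message)
```

A loader would call this at the top of the function body, e.g. `require_modules("pyarrow", install_hint="Install it with pip install pyarrow.")`, so that a bare `import movement` never fails.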

R metadata decoding (`rdata`)

The metadata is R's ASCII serialisation format, not JSON -- it cannot be decoded with standard library tools.
rdata is a pure-Python R serialisation parser. Its only dependencies are numpy, xarray, and pandas -- all already required by movement. This makes it a low-cost addition with no new transitive dependencies.
Alternative: skip full metadata decoding, infer variables_what/when/where from column names (same heuristics aniframe itself uses), and require the user to pass fps/source_software explicitly.

Open questions / items for discussion

pyarrow as core vs optional dependency: Not currently in the dep tree. Add to [project.dependencies], or guard with a runtime check and ask users to install it as needed?
point_of_reference coordinate flip: aniframe defaults to "bottom_left" origin; movement/napari conventionally uses top-left (image coordinates). For now the loader will warn and leave data as-is. Automatic y-axis flipping could be added later.
Metadata decoding strategy: Use rdata (pure Python, no new transitive deps) for full metadata, or infer from column names and accept that source_software/fps/unit_time may need to be provided by the caller?
source_software fallback when aniframe source is NA: use "aniframe", or require the caller to pass source_software explicitly?

Files to create / modify

New files

movement/io/load_aniframe.py -- loader function from_aniframe_file() and internal parsing helpers
tests/test_unit/test_io/test_load_aniframe.py -- unit tests

Modified files

movement/validators/files.py -- add ValidAniframeParquet attrs class
movement/io/load.py -- update SourceSoftware type alias; import new loader
movement/io/__init__.py -- import load_aniframe to trigger loader registration
docs/source/user_guide/input_output.md -- add row to supported formats table
pyproject.toml -- add pyarrow; consider rdata
movement/sample_data.py -- register sample aniframe Parquet file (once added to GIN)

Test plan

Generated with Claude Code

codecov · 2026-04-16T12:21:14Z

Codecov Report

❌ Patch coverage is 93.38235% with 18 lines in your changes missing coverage. Please review.
✅ Project coverage is 99.41%. Comparing base (82b117b) to head (425dc68).

Files with missing lines	Patch %	Lines
movement/io/aniframe.py	90.62%	18 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##              main     #963      +/-   ##
===========================================
- Coverage   100.00%   99.41%   -0.59%     
===========================================
  Files           41       42       +1     
  Lines         2815     3087     +272     
===========================================
+ Hits          2815     3069     +254     
- Misses           0       18      +18


- Add `ValidAniframeParquet` file validator to `validators/files.py`
- Add `from_aniframe_file()` loader in new `movement/io/load_aniframe.py`
- Register "aniframe" as a recognised `SourceSoftware` in `load.py`
- Import `load_aniframe` in `io/__init__.py` to trigger registration
- Add `pyarrow` and `rdata` as core dependencies in `pyproject.toml`
- Add 57-test suite in `tests/test_unit/test_io/test_load_aniframe.py`

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

roaldarbol · 2026-04-16T12:51:17Z

Sample files needed on GIN

Before this PR can be merged we need at least one (ideally two) real aniframe files on GIN for integration tests. I'll generate these from animovement so that the embedded R metadata serialisation is exercised end-to-end. The current unit tests cover all edge cases but mock the metadata-decoding step, which isn't ideal.

Required: 1 × 2D file

The primary integration test file. It should have:

2D coordinates (x, y) and a confidence column
sampling_rate set to a real value (e.g. 30) — the existing test fixture has NaN here, so fps-from-metadata is currently never tested on real data
source set to a known software name (e.g. "SLEAP" or "DeepLabCut")
point_of_reference = "top_left" — so the load completes cleanly without triggering the flip warning (that warning path is already covered by a unit test)
At least 2 individuals and 2 keypoints
~10–20 frames — keep it small for fast downloads
session and trial columns each holding a single value — confirms the single-value-drop path runs on real metadata

Nice to have: 1 × 3D file

A small 3D file (x, y, z) to verify 3D support against real aniframe output. The 3D array-building path is unit-tested synthetically but has never run against a file produced by R.

What you don't need separate files for

The following are all covered by the synthetic unit tests and do not need dedicated GIN files: track → individual rename, polar coordinate rejection, multi-value session/trial error, missing metadata, and the bottom-left origin warning.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Replace triple nested loop with itertools.product and an inner function. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

roaldarbol · 2026-04-16T14:06:31Z

So, outstanding things that have cropped up:

animovement allows users to convert time to pretty much anything, us, ms, s, min, h, etc. (same goes for spatial units), and then the unit is recorded in the metadata. For now, I assume that you would like to just get seconds? The current solution is to have a lookup table and convert on import.
the reference frame in animovement is almost always bottom_left - I think I'll need to add a metadata field that allows converting back to original top-left (which is what we often get from the trackers themselves) so it can also be converted back to top-left for import into movement.
there are some warnings in _decode_aniframe_metadata from the rdata package that I have silenced (it's just that the package doesn't know the aniframe class, which makes complete sense).
extra column names: for extra columns, e.g. computed variables such as "speed", should they just be named as-is, or should they be prefixed to avoid namespace clashes? I'm leaning towards as-is - simply because that's what I would expect as a user.
as mentioned in the comment above, we need some aniframe parquet files on GIN. Once I know exactly what should be tested, I can make a minimal set of them and @niksirbi can upload them?
and finally, the dependencies - do we add pyarrow as a required dependency?

- Use metadata variables_what/when/where to classify DataFrame columns instead of hardcoded frozensets; missing fields raise ValueError
- Extra columns not belonging to any aniframe category are inferred to their minimum xarray dimensions and added as Dataset variables
- Constant extra columns are logged at INFO and skipped
- Numeric extra columns stored as float64; others as object dtype
- Add _extract_meta_vars, _infer_extra_dims, _build_extra_array helpers
- Update _resolve_columns to use metadata vars and return extra col list
- Expand test suite from 57 to 71 tests covering all new code paths

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Per maintainer guidance: tracks are interpreted as individuals by design. Users who need to stitch tracks should do so before loading. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

niksirbi · 2026-04-17T09:32:21Z

Thanks for the PR @roaldarbol! So excited that this is finally happening.

I'll do a first high-level pass at this PR next week, after which I'll get in touch with you to get sample data files.
After I put them on GIN, you can write some tests using them, and I can have another review pass after that.

Sounds like a plan?

roaldarbol · 2026-04-17T09:44:47Z

Sounds good to me!

Adds an `extra_var_dims` floor parameter so callers can ensure extra data variables always carry specific dimensions (e.g. `"individuals"`) even in single-individual/single-keypoint files where auto-inference would otherwise collapse that axis away. Accepts a plain string or a tuple of strings for convenience. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Extract _normalise_extra_var_dims helper to reduce cognitive complexity of from_aniframe_file back within the C901 limit. Shorten two test docstrings to stay within the 79-character line limit. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Picks up the typer dependency from the typer branch to pre-empt the conflict when that branch merges to main. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

roaldarbol · 2026-04-22T12:14:01Z

For extra columns, I have made the function automatically find the minimal number of dimensions needed to describe each one, so it gets the correct dims. For example, if you have the area of the individual, then in animovement all keypoints will have the identical value, and when imported into movement it would resolve to a (time, individual) variable.

The edge case is single-individual/single-keypoint files, where extra variables would resolve to just time. To ensure that extra columns are attributed to the correct dimensions, I added an extra argument (extra_var_dims) that allows the user to override the inference, e.g. extra_var_dims = "individual" if they should all be that. It also allows specification on a per-variable basis:

extra_var_dims = {
    "temperature": ("time",),
    "bbox_area": ("time", "individuals"),
    "speed": ("time", "individuals"),
}
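
One way the minimal-dimension inference described above could work is to test, for each dimension in turn, whether the column's value ever changes along it while the other dimensions are held fixed. The sketch below is an assumption about the approach rather than the PR's actual `_infer_extra_dims`; the name `infer_minimal_dims` is hypothetical, rows are plain dicts using aniframe's singular column names, and a complete grid of rows is assumed.

```python
DIMS = ("time", "individual", "keypoint")


def infer_minimal_dims(rows, column):
    """Return the dims along which `column` actually varies.

    A dim is needed if two rows that agree on every other dim
    disagree on the column's value.
    """
    needed = []
    for dim in DIMS:
        others = [d for d in DIMS if d != dim]
        first_seen = {}
        for row in rows:
            key = tuple(row[d] for d in others)
            # Remember the first value per key; a mismatch means the
            # column varies along `dim`.
            if first_seen.setdefault(key, row[column]) != row[column]:
                needed.append(dim)
                break
    return tuple(needed)
```

With two individuals whose area differs and changes over time, an `area` column resolves to `("time", "individual")`. With a single individual and keypoint the same column collapses to `("time",)`, which is the edge case mentioned above.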

niksirbi

I've made a first pass at this @roaldarbol.

Here are my comments, mostly focusing on design decisions and the questions you'd asked. I've also left several inline comments, but have not gone much into implementation details (those can be handled later).

1. pyarrow / rdata dependencies. Pyarrow is a heavy native package, so forcing it on all users for one file format is out of proportion. The good news is that both packages are also available on conda-forge (that's a hard requirement for us). My recommendation would be to gate both pyarrow and rdata behind a new aniframe optional extra. We can also do a runtime check on top of that, as you suggested, with a clear error message ("install with pip install movement[aniframe] or conda install -c conda-forge pyarrow rdata") if someone tries to use the new loader without the requisite dependencies. On movement's conda-forge feedstock (I update that upon release), I will add both packages under run_constrained:, not run:. That keeps the default conda install lean, and still protects against version incompatibilities if a user installs them separately. This matches the pattern used by other scientific Python projects (including xarray) for their heavy optional backends. We will need to also update the installation guide accordingly.
2. point_of_reference = "bottom_left" handling. I'm fine with just emitting a warning for now, but you may also want to consider doing the y-flip by default. The reason is that with "bottom_left" the data will not correctly overlay on a video/frame as is. So if we intend to use our napari GUI as a viewer of aniframe .parquet files (would be neat!), auto-converting to movement's "top_left" convention would make things easier. In the near future, it would enable someone to drag and drop a video, followed by the parquet file into napari.
3. Metadata decoding strategy. The rdata package is lightweight and the full-metadata path is much better UX than "pass fps / source manually". Keep as-is, but make it part of an optional extra dependency (see point 1).
4. source_software fallback when source is NA. Let's not require a user to pass source_software. I'm fine with falling back to either aniframe or None (as the code currently does).
5. Time-unit handling. Converting everything to seconds is the right default with fallback to frame units when that's not possible (I think your implementation already takes that approach).
6. Extra-column namespace. Unprefixed is fine and matches user expectations; this is the right default.
7. Sample data on GIN. The 2D + 3D spec in the comment thread looks right. Feel free to share the files with me and I will put them on GIN. For each .parquet file, also fill out a metadata.yaml entry as [described here](https://movement.neuroinformatics.dev/latest/community/contributing.html#metadata-yaml-example-entry). For the 2D file, also provide a sample frame at minimum, and optionally the corresponding video if you can share it. This will make it possible to user-test the napari overlay.
8. Loading aniframe files in napari. If we want the .parquet files to be also load-able into napari, you will have to at minimum update the SUPPORTED_POSES_FILES constant in movement/napari/loader_widgets.py. See also point 2 above.

roaldarbol · 2026-04-28T19:30:55Z

Thanks for such a detailed review @niksirbi! Will go through it later this week!

Before I do anything else, I'm trying to look for lightweight alternatives to pyarrow, as it would be preferable to keep the deps obligatory if at all possible. Would something like fastparquet, which only also adds cramjam + fsspec, be small enough? Other alternatives are duckdb or polars.

Edit: From the fastparquet docs:

March 2026. The release of pandas 3.0 has broken a number of things in fastparquet. Since pandas now depends explicitly on pyarrow, there is no longer any demand for the existence of this project, and it is being retired. Perhaps use will continue for those still using pandas 2.x, but we anticipate no further development.

For the auto-flipping y-axis, I need to implement something - a metadata field - that keeps the y value, ideally the frame height, but otherwise probably the max. y value from the data (I think that's what I currently use in the readers), which would allow converting between bottom_left and top_left on the fly. And then this value would be pulled in in this reader.

niksirbi · 2026-04-29T16:44:36Z

Before I do anything else, I'm trying to look for lightweight alternatives to pyarrow, as it would be preferable to keep the deps obligatory if at all possible. Would something like fastparquet, which only also adds cramjam + fsspec, be small enough? Other alternatives are duckdb or polars.

Thanks for looking into alternatives! I spent some time looking into them as well. I think we can exclude fastparquet based on the note you've posted. Regarding pandas' dependency on pyarrow, an initial read of that note is slightly misleading: pyarrow isn't included among pandas' core dependencies, it's gated behind parquet extras, see the pyproject.toml. But in any case, that fact is encouraging about the stability and trustworthiness of pyarrow; it's the standard go-to choice within the scientific Python ecosystem for reading parquet files, and more established in that context than duckdb or polars.

My reasoning for having it as an optional dependency is that it's heavy-ish and necessary for only a small subset of users (this may change in the future though...). The counter-argument would be to avoid complicating installation instructions.

Before we reach a decision on this, I'd be curious to hear why you think it's preferable to keep the deps obligatory.

For the auto-flipping y-axis, I need to implement something - a metadata field - that keeps the y value, ideally the frame height, but otherwise probably the max. y value from the data (I think that's what I currently use in the readers), which would allow converting between bottom_left and top_left on the fly. And then this value would be pulled in in this reader.

Ah yeah, that makes sense. If you prefer, you're welcome to proceed here without waiting for that, and tackle the auto-flipping in a future PR (opening an issue to avoid forgetting it). Up to you!

roaldarbol · 2026-04-30T09:08:03Z

Ah yeah, that makes sense. If you prefer, you're welcome to proceed here without waiting for that, and tackle the auto-flipping in a future PR (opening an issue to avoid forgetting it). Up to you!

I think I'll get it done for this PR. My thought is basically, at read time, to encode either (1) the frame height or (2) the max(y) as y_height. Then a simple flipping function new_y = h - y, where h is y_height and y is the vector of y values, will convert back and forth between top and bottom origin. I'll implement such a function in animovement. Should I also make a function for it in this movement PR, or should I just do it at read time? The advantage of a separate function is that it could be used for changing the reference frame going forwards.

If I make such a function, where would you like for it to go?
And do you think y_height sounds like a reasonable metadata field name?

For pyarrow, yeah I also don't think they got it correct - see the discussion of it here: pandas-dev/pandas#54466.

I think my main concern is on the conda-forge side of things, that users need to explicitly add them as extra dependencies. But with no leaner good alternative I think it's just the way it will have to be. :-) In that case, I imagine we should add an informative error, something like:

try:
    import pyarrow.parquet as pq
except ImportError as e:
    raise ImportError(
        "Reading Parquet files requires the optional 'aniframe' "
        "dependencies (pyarrow, rdata), which are not installed.\n\n"
        "Install them with one of:\n"
        "  pip install 'movement[aniframe]'\n"
        "  conda install -c conda-forge pyarrow rdata\n"
        "  pixi add pyarrow rdata\n\n"
        "See https://movement.neuroinformatics.dev/latest/user_guide/installation.html "
        "for details."
    ) from e

EDIT: I actually quite like the tabbed install instructions, so I don't think it'll be an issue, maybe we'd just need to add it there too? https://movement.neuroinformatics.dev/latest/user_guide/installation.html#install-the-package

niksirbi · 2026-05-01T14:21:37Z

I think I'll get it done for this PR. My thought is basically, at read time, to encode either (1) the frame height or (2) the max(y) as y_height. Then a simple flipping function new_y = h - y, where h is y_height and y is the vector of y values, will convert back and forth between top and bottom origin. I'll implement such a function in animovement. Should I also make a function for it in this movement PR, or should I just do it at read time? The advantage of a separate function is that it could be used for changing the reference frame going forwards.

If I make such a function, where would you like for it to go?

And do you think y_height sounds like a reasonable metadata field name?

Sounds sensible. For this PR, I would just implement this at .parquet read time.
There's also an argument for making this a standalone utility inside movement.transforms, but it will be easier to think about that once we have the first (read-time) implementation here.

I think my main concern is on the conda-forge side of things, that users need to explicitly add them as extra dependencies. But with no leaner good alternative I think it's just the way it will have to be. :-) In that case, I imagine we should add an informative error, something like:

In the call today, we decided to have pyarrow as extras for now (we may 'promote' it to core dependency in the future).

I think raising an informative error, like the one you suggested, is a great idea.
And yes, I think we should mention this 'extra' in the tabbed installation instructions (as we do for napari).

Implements the inline and high-level review comments on neuroinformatics-unit#963.

Refactor:
- Move public `from_aniframe_file` into `load_poses.py` next to other format-specific loaders; keep private helpers in a new `aniframe.py` module (mirroring the `nwb.py` pattern).
- Split `ValidAniframeParquet` into `_file_validator + _parquet_validator` and fold the `variables_what/when/where` required-fields check into the validator itself.
- Rename `point_of_reference` → `origin` throughout (metadata key, dataset attr, docstrings, tests).

Behaviour:
- Add `use_frame_numbers_from_file=False` flag so callers can preserve the file's `time` column values (irregular sampling, non-zero start) and convert them to seconds via the now-wired `_TIME_UNIT_TO_SECONDS` lookup.
- Drop the `extra_var_dims` parameter; instead auto-expand inferred dims to include any canonical dim that is a singleton in this file, so output shapes stay consistent across files differing only in singleton-dim sizes (e.g. 1- vs 3-individual).
- Auto-flip y when `origin == "bottom_left"`. Uses `y_height` from the metadata when present, otherwise falls back to `max(y)` from the data. Sets `ds.attrs["origin"] = "top_left"` and records `y_height` so the flip can be reversed later. The previous bottom-left `UserWarning` is gone, since the data is now oriented correctly.

Build:
- Move `pyarrow` and `rdata` into a new `aniframe` optional extra. Bare `import movement` no longer requires either; the loader (and validator) raise a clear `ImportError` with install instructions only when an aniframe code path is actually invoked.
- Update the installation guide with the new extra across all three install tabs (conda-forge, pip, uv) including combined-extras syntax.
- Register `aniframe / *.parquet` in the napari plugin's `SUPPORTED_POSES_FILES` so the GUI can browse and load aniframe files.

Cleanups:
- `_decode_aniframe_metadata` now raises `ValueError` with a clear message on missing `b"r"` key or an undecodable R blob (was log-and-return-`{}`, which produced a confusing two-step error).
- Special-case `bool` dtype in `_build_extra_array` (previously coerced to `float64` with NaN fill via `is_numeric_dtype`).
- `individual_names` now uses first-occurrence ordering to match `keypoint_names`.
- Drop the redundant `or unit_time == "s"` clause in `_resolve_fps`.
- Docstring fixes: link to the aniframe spec, example uses `from_aniframe_file` directly, `_resolve_columns` notes "INFO log" instead of "warning".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

roaldarbol · 2026-05-09T21:46:58Z

Alright, now I've spent a few hours going through the comments and addressing them. Here's a few things I think are worth flagging:

In aniframe I renamed point_of_reference to origin, so I renamed this everywhere (metadata key, ds.attrs, docstrings, tests) to fit.
I removed extra_var_dims completely and replaced with the singleton-canonical-dim auto-expansion you suggested.
Now auto-flips y unconditionally when origin == "bottom_left". Uses y_height from the metadata when present, falls back to max(y) from the data otherwise, which is how I've also implemented it in aniframe. After flipping, ds.attrs["origin"] is set to top_left and y_height is recorded so the flip can be reversed.
I also added the dependencies as extras in pyproject.toml, do please have a look to see whether it matches the style you'd expect.
_parquet_validator is now a generic factory that mirrors _hdf5_validator. The aniframe-specific @file.validator method on ValidAniframeParquet no longer touches pq directly, but delegates the b"r" key check to _decode_aniframe_metadata, then does the required-fields check on top.
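
To make the y-flip concrete, here is a minimal sketch of the behaviour described in the list above, written against plain Python lists rather than the dataset. The function name `flip_y_origin` is hypothetical; the `max(y)` fallback ignores NaN values, which is an assumption about how missing detections should be treated.

```python
import math


def flip_y_origin(y_values, origin, y_height=None):
    """Return (y, origin, y_height) expressed with a top-left origin.

    Data that is not bottom-left is returned unchanged. Otherwise each
    value becomes y_height - y, with y_height falling back to the
    largest non-NaN y when the metadata does not provide it.
    """
    if origin != "bottom_left":
        return list(y_values), origin, y_height
    if y_height is None:
        finite = [y for y in y_values if not math.isnan(y)]
        y_height = max(finite, default=0.0)
    flipped = [y_height - y for y in y_values]
    return flipped, "top_left", y_height
```

Because `new_y = h - y` is its own inverse, applying the same subtraction with the recorded `y_height` restores the bottom-left values, which is why the height is kept in the attributes.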

into animovement-reader # Conflicts: # pyproject.toml

SonarQube flagged the direct == checks on ds.attrs["y_height"] in the new bottom_left → top_left flip tests. Switch to pytest.approx, matching the existing fps assertions in the same file. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

sonarqubecloud · 2026-05-09T21:59:49Z

Quality Gate passed

Issues
1 New issue
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

roaldarbol · 2026-05-11T14:20:28Z

Additional point that we should discuss:

aniframes can contain multiple trials or sessions or other such temporal variables (variables_when). How should movement handle that, if at all? I wonder whether we automatically could return a list of xarrays? Or should it fail?

niksirbi · 2026-05-14T08:38:35Z

Thanks for the updates @roaldarbol. I've been busy with other projects this week, but I will have another pass on this next week at the latest. I'll also think about the sessions/trials question.

niksirbi · 2026-05-14T08:41:04Z

Btw, it's looking like #973 is going to be merged ahead of this one, so you will be able to do away with the singular-plural conversion in this PR.

roaldarbol · 2026-05-22T10:24:21Z

Follow-up for the when issue.

If there is more than one video (so multiple values in the when variables), then we give an error.
We can supply a parameter (let's say session_keys; the name is very much undecided) that takes a dict with the name of the variable and the value, e.g. {session: 1, trial: 3}.
For the error message, provide the names of the when variables - and maybe also potential values?
Otherwise encourage users to split up on write from aniframe (needs its own issue on aniread).
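
A rough sketch of how the proposal above might behave is given below. Nothing here is decided: `select_recording` and its signature are hypothetical, `session_keys` is the tentative parameter name from the list, and rows are plain dicts for illustration.

```python
def select_recording(rows, when_columns, session_keys=None):
    """Filter rows by session_keys, then require a single recording.

    `when_columns` lists the non-time 'when' variables, for example
    ("session", "trial"). If any of them still holds several values
    after filtering, the error names the variables and their values.
    """
    session_keys = session_keys or {}
    selected = [
        row
        for row in rows
        if all(row[name] == value for name, value in session_keys.items())
    ]
    ambiguous = {}
    for name in when_columns:
        values = sorted({row[name] for row in selected})
        if len(values) > 1:
            ambiguous[name] = values
    if ambiguous:
        raise ValueError(
            f"Multiple values found in 'when' variables: {ambiguous}. "
            "Pass session_keys to select a single recording."
        )
    return selected
```

Listing the candidate values in the message lets a user copy one straight into `session_keys`, e.g. `{"session": 1, "trial": 3}`.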

WIP: animovement/aniframe reader (design phase)

4564731

roaldarbol and others added 3 commits April 16, 2026 14:56

Use pytest.approx for floating-point fps assertions in aniframe tests

911f752

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Extract _r_scalar_to_python to reduce cognitive complexity

a1e9339

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Reduce cognitive complexity of _minimal_df test helper

011bbce

Replace triple nested loop with itertools.product and an inner function. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

roaldarbol and others added 3 commits April 16, 2026 16:17

Downgrade track→individual rename from warning to INFO log

4035bf9

Per maintainer guidance: tracks are interpreted as individuals by design. Users who need to stitch tracks should do so before loading. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Add animovement/aniframe to supported formats table in docs

b7c18e8

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

roaldarbol marked this pull request as ready for review April 16, 2026 14:30

roaldarbol and others added 4 commits April 22, 2026 13:40

Add typer>=0.9.0 to dependencies

cea06b7

Picks up the typer dependency from the typer branch to pre-empt the conflict when that branch merges to main. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Merge branch 'main' into animovement-reader

ba7e6ef

roaldarbol mentioned this pull request Apr 23, 2026

Stand-alone function for creating output CSV file OCTRON-tracking/OCTRON-GUI#64

Open

Merge remote-tracking branch 'upstream/main' into animovement-reader

92963f9

roaldarbol mentioned this pull request Apr 28, 2026

Dev EthoML/VAME#214

Merged

9 tasks

niksirbi requested changes Apr 28, 2026

Merge branch 'main' of https://github.com/neuroinformatics-unit/movement

c8318d8

into animovement-reader # Conflicts: # pyproject.toml
